Parallel cross-validation: A scalable fitting method for Gaussian process models
Abstract
Gaussian process (GP) models are widely used to analyze spatially referenced data and to predict values at locations without observations. They are based on a statistical framework, which enables uncertainty quantification of the model structure and predictions. Both the evaluation of the likelihood and the prediction involve solving linear systems. Hence, the computational costs are large and limit the amount of data that can be handled. While there are many approximation strategies that lower the computational cost of GP models, they often provide sub-optimal support for the parallel computing capabilities of (high-performance) computing environments. To bridge this gap, a parallelizable parameter estimation and prediction method is presented. The key idea is to divide the spatial domain into overlapping subsets and to use cross-validation (CV) to estimate the covariance parameters in parallel. Although simulations show that CV is less effective for parameter estimation than the maximum likelihood method, it is amenable to parallel computing and enables the handling of large datasets. Exploiting the screen effect for spatial prediction helps to arrive at an analysis close to a global computation despite performing parallel computations in local regions. Simulation studies assess the accuracy of the parameter estimates and predictions. The implementation shows good weak and strong parallel scaling properties. For illustration, an exponential covariance model is fitted to a scientifically relevant canopy height dataset with 5 million observations. Using 512 processor cores in parallel brings the evaluation time of one covariance parameter configuration to 1.5 minutes.
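The key idea in the abstract (overlapping spatial subsets, with a CV criterion evaluated per subset in parallel) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the CV scheme, so the sketch assumes a zero-mean GP, a fixed sill, a small nugget, the closed-form leave-one-out residual, strips along one coordinate as subsets, and a thread pool in place of a high-performance computing setup. All function names (`exp_cov`, `overlapping_blocks`, `block_cv_score`, `cv_objective`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def exp_cov(a, b, range_, sill=1.0):
    """Exponential covariance sill * exp(-d / range_) between point sets a and b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)


def overlapping_blocks(x, n_blocks, overlap):
    """Cut [0, 1] along the first coordinate into strips widened by `overlap`.

    Returns (idx, core) pairs: idx indexes the points used for the local fit,
    core flags those inside the un-widened strip, which are the ones scored.
    """
    edges = np.linspace(0.0, 1.0, n_blocks + 1)
    blocks = []
    for k, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        idx = np.flatnonzero((x[:, 0] >= lo - overlap) & (x[:, 0] <= hi + overlap))
        xi = x[idx, 0]
        upper = xi <= hi if k == n_blocks - 1 else xi < hi  # last strip is closed
        blocks.append((idx, (xi >= lo) & upper))
    return blocks


def block_cv_score(x, y, idx, core, range_, nugget=1e-6):
    """Sum of squared leave-one-out residuals over the core points of one block."""
    K = exp_cov(x[idx], x[idx], range_) + nugget * np.eye(len(idx))
    K_inv = np.linalg.inv(K)
    # Closed-form LOO residual of a zero-mean GP: [K^-1 y]_i / [K^-1]_ii
    loo_resid = (K_inv @ y[idx]) / np.diag(K_inv)
    return float(np.sum(loo_resid[core] ** 2))


def cv_objective(x, y, blocks, range_, workers=4):
    """Evaluate the blocks concurrently and add up their CV scores."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda b: block_cv_score(x, y, b[0], b[1], range_), blocks)
        return sum(scores)


# Simulated data (assumption: 300 points in the unit square, true range 0.2).
rng = np.random.default_rng(0)
x = rng.uniform(size=(300, 2))
K_true = exp_cov(x, x, 0.2) + 1e-6 * np.eye(len(x))
y = np.linalg.cholesky(K_true) @ rng.standard_normal(len(x))

blocks = overlapping_blocks(x, n_blocks=4, overlap=0.1)
ranges = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
scores = [cv_objective(x, y, blocks, r) for r in ranges]
best_range = float(ranges[int(np.argmin(scores))])
print("range with the smallest CV score:", best_range)
```

Each block only needs the observations in its own widened strip, so the per-block work is independent and could be distributed across processes or nodes; the overlap is what lets points near a strip boundary still be predicted from neighbours on both sides.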
Similar resources
Scalable Inference for Structured Gaussian Process Models
The generic inference and learning algorithm for Gaussian Process (GP) regression has O(N^3) runtime and O(N^2) memory complexity, where N is the number of observations in the dataset. Given the computational resources available to a present-day workstation, this implies that GP regression simply cannot be run on large datasets. The need to use non-Gaussian likelihood functions for tasks such as c...
A Parallel Quasi-Newton Method for Gaussian Data Fitting
We describe a parallel method for unconstrained optimization based on the quasi-Newton descent method of Broyden, Fletcher, Goldfarb, and Shanno. Our algorithm is suitable for both single-instruction and multiple-instruction parallel architectures and has only linear memory requirements in the number of parameters used to fit the data. We also present the results of numerical testing on both sin...
Scalable Inference for Gaussian Process Models with Black-Box Likelihoods
We propose a sparse method for scalable automated variational inference (AVI) in a large class of models with Gaussian process (GP) priors, multiple latent functions, multiple outputs and non-linear likelihoods. Our approach maintains the statistical efficiency property of the original AVI method, requiring only expectations over univariate Gaussian distributions to approximate the posterior wi...
Scalable Logit Gaussian Process Classification
We propose an efficient stochastic variational approach to Gaussian Process (GP) classification building on Pólya-Gamma data augmentation and inducing points, which is based on closed-form updates of natural gradients. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnitude faster than the state-of-the-art...
Scalable Variational Gaussian Process Classification
Gaussian process classification is a popular method with a number of appealing properties. We show how to scale the model within a variational inducing point framework, outperforming the state of the art on benchmark datasets. Importantly, the variational formulation can be exploited to allow classification in problems with millions of data points, as we demonstrate in experiments.
Journal
Journal title: Computational Statistics & Data Analysis
Year: 2021
ISSN: 0167-9473, 1872-7352
DOI: https://doi.org/10.1016/j.csda.2020.107113